Overview

Dataset statistics

Number of variables: 33
Number of observations: 395
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 443.9 KiB
Average record size in memory: 1.1 KiB

Variable types

NUM: 13
CAT: 12
BOOL: 8

Reproduction

Analysis started: 2020-07-28 15:28:40.908266
Analysis finished: 2020-07-28 15:29:15.243075
Duration: 34.33 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

G3 is highly correlated with G2 (High correlation)
G2 is highly correlated with G3 (High correlation)
absences has 115 (29.1%) zeros (Zeros)
G2 has 13 (3.3%) zeros (Zeros)
G3 has 38 (9.6%) zeros (Zeros)
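The zero percentages in these warnings follow directly from the zero counts and the 395 observations. A minimal sketch of that arithmetic, where the helper name `zero_share` is hypothetical and not part of pandas-profiling:

```python
# Zero counts are copied from the warnings; the total comes from the dataset statistics.
N_OBSERVATIONS = 395
zero_counts = {"absences": 115, "G2": 13, "G3": 38}


def zero_share(count, total):
    """Return the percentage of zero values, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: zero_share(count, N_OBSERVATIONS) for name, count in zero_counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}% zeros")
```

The threshold at which the report decides to raise a Zeros warning is not stated in the report, so it is left out of the sketch.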

Variables

school
Categorical

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
GP     349    88.4%
MS     46     11.6%

Length

Max length2
Median length2
Mean length2
Min length2

Overview of Unicode Properties

Unique unicode characters4
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
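As a rough illustration of how such per-character properties can be tallied, the sketch below uses Python's standard `unicodedata` module on the `school` value counts (GP: 349, MS: 46). It covers only characters and general categories; the standard library has no script or block lookup, and the report's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Value counts for `school`, copied from the frequency table of this variable.
value_counts = {"GP": 349, "MS": 46}

char_counts = Counter()
category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        # Each character occurs once per row that holds this value.
        char_counts[ch] += count
        # unicodedata.category returns a short code, e.g. "Lu" for Uppercase Letter.
        category_counts[unicodedata.category(ch)] += count

print(dict(char_counts))
print(dict(category_counts))
```

The totals agree with the tables that follow: four distinct characters and 790 uppercase letters in total.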

Most occurring characters

ValueCountFrequency (%) 
G34944.2%
 
P34944.2%
 
M465.8%
 
S465.8%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter790100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
G34944.2%
 
P34944.2%
 
M465.8%
 
S465.8%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin790100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
G34944.2%
 
P34944.2%
 
M465.8%
 
S465.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII790100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
G34944.2%
 
P34944.2%
 
M465.8%
 
S465.8%
 

sex
Categorical

Distinct count2
Unique (%)0.5%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
F
208
M
187
ValueCountFrequency (%) 
F20852.7%
 
M18747.3%
 

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters2
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
F20852.7%
 
M18747.3%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter395100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F20852.7%
 
M18747.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin395100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
F20852.7%
 
M18747.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII395100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
F20852.7%
 
M18747.3%
 

age
Real number (ℝ≥0)

Distinct count: 8
Unique (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.696202531645568
Minimum: 15
Maximum: 22
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 15
5-th percentile: 15
Q1: 16
Median: 17
Q3: 18
95-th percentile: 19
Maximum: 22
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.276042725
Coefficient of variation (CV): 0.07642712301
Kurtosis: -0.001221778069
Mean: 16.69620253
Median Absolute Deviation (MAD): 1
Skewness: 0.4662701614
Sum: 6595
Variance: 1.628285035
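Several of these figures can be recomputed from the value counts listed for `age` in this section. The sketch below does so in plain Python. It assumes the sample variance (divisor n - 1), which is the choice that reproduces the reported value; the report itself does not state which divisor it uses.

```python
import math

# Frequency table for `age`, copied from the value counts in this section.
age_counts = {15: 82, 16: 104, 17: 98, 18: 82, 19: 24, 20: 3, 21: 1, 22: 1}

n = sum(age_counts.values())
total = sum(value * count for value, count in age_counts.items())
mean = total / n

# Sample variance with divisor n - 1 (an assumption, see the note above).
squared_deviations = sum(count * (value - mean) ** 2 for value, count in age_counts.items())
variance = squared_deviations / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(n, total, mean, variance, std, cv)
```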
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
16     104    26.3%
17     98     24.8%
18     82     20.8%
15     82     20.8%
19     24     6.1%
20     3      0.8%
22     1      0.3%
21     1      0.3%

Value  Count  Frequency (%)
15     82     20.8%
16     104    26.3%
17     98     24.8%
18     82     20.8%
19     24     6.1%
20     3      0.8%
21     1      0.3%
22     1      0.3%

Value  Count  Frequency (%)
22     1      0.3%
21     1      0.3%
20     3      0.8%
19     24     6.1%
18     82     20.8%
17     98     24.8%
16     104    26.3%
15     82     20.8%

address
Categorical

Distinct count2
Unique (%)0.5%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
U
307
R
88
ValueCountFrequency (%) 
U30777.7%
 
R8822.3%
 

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters2
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
U30777.7%
 
R8822.3%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter395100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
U30777.7%
 
R8822.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin395100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
U30777.7%
 
R8822.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII395100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
U30777.7%
 
R8822.3%
 

famsize
Categorical

Distinct count2
Unique (%)0.5%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
GT3
281
LE3
114
ValueCountFrequency (%) 
GT328171.1%
 
LE311428.9%
 

Length

Max length3
Median length3
Mean length3
Min length3

Overview of Unicode Properties

Unique unicode characters5
Unique unicode categories (?)2
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
339533.3%
 
G28123.7%
 
T28123.7%
 
L1149.6%
 
E1149.6%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter79066.7%
 
Decimal Number39533.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
G28135.6%
 
T28135.6%
 
L11414.4%
 
E11414.4%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
3395100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin79066.7%
 
Common39533.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
G28135.6%
 
T28135.6%
 
L11414.4%
 
E11414.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
3395100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1185100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
339533.3%
 
G28123.7%
 
T28123.7%
 
L1149.6%
 
E1149.6%
 

Pstatus
Categorical

Distinct count2
Unique (%)0.5%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
T
354
A
 
41
ValueCountFrequency (%) 
T35489.6%
 
A4110.4%
 

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters2
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
T35489.6%
 
A4110.4%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter395100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
T35489.6%
 
A4110.4%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin395100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
T35489.6%
 
A4110.4%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII395100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
T35489.6%
 
A4110.4%
 

Medu
Real number (ℝ≥0)

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.749367088607595
Minimum0
Maximum4
Zeros3
Zeros (%)0.8%
Memory size3.2 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median3
Q34
95-th percentile4
Maximum4
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.094735141
Coefficient of variation (CV)0.3981771463
Kurtosis-1.09001438
Mean2.749367089
Median Absolute Deviation (MAD)1
Skewness-0.3183806885
Sum1086
Variance1.19844503
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
413133.2%
 
210326.1%
 
39925.1%
 
15914.9%
 
030.8%
 
ValueCountFrequency (%) 
030.8%
 
15914.9%
 
210326.1%
 
39925.1%
 
413133.2%
 
ValueCountFrequency (%) 
413133.2%
 
39925.1%
 
210326.1%
 
15914.9%
 
030.8%
 

Fedu
Real number (ℝ≥0)

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.5215189873417723
Minimum0
Maximum4
Zeros2
Zeros (%)0.5%
Memory size3.2 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median2
Q33
95-th percentile4
Maximum4
Range4
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.088200546
Coefficient of variation (CV)0.4315654775
Kurtosis-1.198538762
Mean2.521518987
Median Absolute Deviation (MAD)1
Skewness-0.03167209444
Sum996
Variance1.184180428
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
211529.1%
 
310025.3%
 
49624.3%
 
18220.8%
 
020.5%
 
ValueCountFrequency (%) 
020.5%
 
18220.8%
 
211529.1%
 
310025.3%
 
49624.3%
 
ValueCountFrequency (%) 
49624.3%
 
310025.3%
 
211529.1%
 
18220.8%
 
020.5%
 

Mjob
Categorical

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
other
141
services
103
at_home
59
teacher
58
health
34
ValueCountFrequency (%) 
other14135.7%
 
services10326.1%
 
at_home5914.9%
 
teacher5814.7%
 
health348.6%
 

Length

Max length8
Median length7
Mean length6.460759494
Min length5

Overview of Unicode Properties

Unique unicode characters13
Unique unicode categories (?)2
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e55621.8%
 
h32612.8%
 
r30211.8%
 
t29211.4%
 
s2068.1%
 
o2007.8%
 
c1616.3%
 
a1515.9%
 
v1034.0%
 
i1034.0%
 
_592.3%
 
m592.3%
 
l341.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter249397.7%
 
Connector Punctuation592.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e55622.3%
 
h32613.1%
 
r30212.1%
 
t29211.7%
 
s2068.3%
 
o2008.0%
 
c1616.5%
 
a1516.1%
 
v1034.1%
 
i1034.1%
 
m592.4%
 
l341.4%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_59100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin249397.7%
 
Common592.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e55622.3%
 
h32613.1%
 
r30212.1%
 
t29211.7%
 
s2068.3%
 
o2008.0%
 
c1616.5%
 
a1516.1%
 
v1034.1%
 
i1034.1%
 
m592.4%
 
l341.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
_59100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2552100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e55621.8%
 
h32612.8%
 
r30211.8%
 
t29211.4%
 
s2068.1%
 
o2007.8%
 
c1616.3%
 
a1515.9%
 
v1034.0%
 
i1034.0%
 
_592.3%
 
m592.3%
 
l341.3%
 

Fjob
Categorical

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
other
217
services
111
teacher
 
29
at_home
 
20
health
 
18
ValueCountFrequency (%) 
other21754.9%
 
services11128.1%
 
teacher297.3%
 
at_home205.1%
 
health184.6%
 

Length

Max length8
Median length5
Mean length6.136708861
Min length5

Overview of Unicode Properties

Unique unicode characters13
Unique unicode categories (?)2
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e53522.1%
 
r35714.7%
 
h30212.5%
 
t28411.7%
 
o2379.8%
 
s2229.2%
 
c1405.8%
 
v1114.6%
 
i1114.6%
 
a672.8%
 
_200.8%
 
m200.8%
 
l180.7%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter240499.2%
 
Connector Punctuation200.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e53522.3%
 
r35714.9%
 
h30212.6%
 
t28411.8%
 
o2379.9%
 
s2229.2%
 
c1405.8%
 
v1114.6%
 
i1114.6%
 
a672.8%
 
m200.8%
 
l180.7%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_20100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin240499.2%
 
Common200.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e53522.3%
 
r35714.9%
 
h30212.6%
 
t28411.8%
 
o2379.9%
 
s2229.2%
 
c1405.8%
 
v1114.6%
 
i1114.6%
 
a672.8%
 
m200.8%
 
l180.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
_20100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2424100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e53522.1%
 
r35714.7%
 
h30212.5%
 
t28411.7%
 
o2379.8%
 
s2229.2%
 
c1405.8%
 
v1114.6%
 
i1114.6%
 
a672.8%
 
_200.8%
 
m200.8%
 
l180.7%
 

reason
Categorical

Distinct count4
Unique (%)1.0%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
course
145
home
109
reputation
105
other
36
ValueCountFrequency (%) 
course14536.7%
 
home10927.6%
 
reputation10526.6%
 
other369.1%
 

Length

Max length10
Median length6
Mean length6.420253165
Min length4

Overview of Unicode Properties

Unique unicode characters13
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
o39515.6%
 
e39515.6%
 
r28611.3%
 
u2509.9%
 
t2469.7%
 
c1455.7%
 
s1455.7%
 
h1455.7%
 
m1094.3%
 
p1054.1%
 
a1054.1%
 
i1054.1%
 
n1054.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter2536100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o39515.6%
 
e39515.6%
 
r28611.3%
 
u2509.9%
 
t2469.7%
 
c1455.7%
 
s1455.7%
 
h1455.7%
 
m1094.3%
 
p1054.1%
 
a1054.1%
 
i1054.1%
 
n1054.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2536100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
o39515.6%
 
e39515.6%
 
r28611.3%
 
u2509.9%
 
t2469.7%
 
c1455.7%
 
s1455.7%
 
h1455.7%
 
m1094.3%
 
p1054.1%
 
a1054.1%
 
i1054.1%
 
n1054.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2536100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
o39515.6%
 
e39515.6%
 
r28611.3%
 
u2509.9%
 
t2469.7%
 
c1455.7%
 
s1455.7%
 
h1455.7%
 
m1094.3%
 
p1054.1%
 
a1054.1%
 
i1054.1%
 
n1054.1%
 

guardian
Categorical

Distinct count3
Unique (%)0.8%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
mother
273
father
90
other
 
32
ValueCountFrequency (%) 
mother27369.1%
 
father9022.8%
 
other328.1%
 

Length

Max length6
Median length6
Mean length5.918987342
Min length5

Overview of Unicode Properties

Unique unicode characters8
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t39516.9%
 
h39516.9%
 
e39516.9%
 
r39516.9%
 
o30513.0%
 
m27311.7%
 
f903.8%
 
a903.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter2338100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t39516.9%
 
h39516.9%
 
e39516.9%
 
r39516.9%
 
o30513.0%
 
m27311.7%
 
f903.8%
 
a903.8%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2338100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t39516.9%
 
h39516.9%
 
e39516.9%
 
r39516.9%
 
o30513.0%
 
m27311.7%
 
f903.8%
 
a903.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2338100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t39516.9%
 
h39516.9%
 
e39516.9%
 
r39516.9%
 
o30513.0%
 
m27311.7%
 
f903.8%
 
a903.8%
 

traveltime
Categorical

Distinct count4
Unique (%)1.0%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
1
257
2
107
3
 
23
4
 
8
ValueCountFrequency (%) 
125765.1%
 
210727.1%
 
3235.8%
 
482.0%
 

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters4
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
125765.1%
 
210727.1%
 
3235.8%
 
482.0%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number395100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
125765.1%
 
210727.1%
 
3235.8%
 
482.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common395100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
125765.1%
 
210727.1%
 
3235.8%
 
482.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII395100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
125765.1%
 
210727.1%
 
3235.8%
 
482.0%
 

studytime
Categorical

Distinct count4
Unique (%)1.0%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
2
198
1
105
3
65
4
 
27
ValueCountFrequency (%) 
219850.1%
 
110526.6%
 
36516.5%
 
4276.8%
 

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters4
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
219850.1%
 
110526.6%
 
36516.5%
 
4276.8%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number395100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
219850.1%
 
110526.6%
 
36516.5%
 
4276.8%
 

Most occurring scripts

ValueCountFrequency (%) 
Common395100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
219850.1%
 
110526.6%
 
36516.5%
 
4276.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII395100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
219850.1%
 
110526.6%
 
36516.5%
 
4276.8%
 

failures
Categorical

Distinct count4
Unique (%)1.0%
Missing0
Missing (%)0.0%
Memory size3.2 KiB
0
312
1
 
50
2
 
17
3
 
16
ValueCountFrequency (%) 
031279.0%
 
15012.7%
 
2174.3%
 
3164.1%
 

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters4
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
031279.0%
 
15012.7%
 
2174.3%
 
3164.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number395100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
031279.0%
 
15012.7%
 
2174.3%
 
3164.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Common395100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
031279.0%
 
15012.7%
 
2174.3%
 
3164.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII395100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
031279.0%
 
15012.7%
 
2174.3%
 
3164.1%
 

schoolsup
Boolean

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
no     344    87.1%
yes    51     12.9%

famsup
Boolean

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
yes    242    61.3%
no     153    38.7%

paid
Boolean

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
no     214    54.2%
yes    181    45.8%

activities
Boolean

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
yes    201    50.9%
no     194    49.1%

nursery
Boolean

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
yes    314    79.5%
no     81     20.5%

higher
Boolean

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
yes    375    94.9%
no     20     5.1%

internet
Boolean

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
yes    329    83.3%
no     66     16.7%

romantic
Boolean

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value  Count  Frequency (%)
no     263    66.6%
yes    132    33.4%

famrel
Real number (ℝ≥0)

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.9443037974683546
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size3.2 KiB

Quantile statistics

Minimum1
5-th percentile2
Q14
median4
Q35
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.8966586077
Coefficient of variation (CV)0.2273300064
Kurtosis1.139772294
Mean3.944303797
Median Absolute Deviation (MAD)1
Skewness-0.9518816901
Sum1558
Variance0.8039966587
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
419549.4%
 
510626.8%
 
36817.2%
 
2184.6%
 
182.0%
 
ValueCountFrequency (%) 
182.0%
 
2184.6%
 
36817.2%
 
419549.4%
 
510626.8%
 
ValueCountFrequency (%) 
510626.8%
 
419549.4%
 
36817.2%
 
2184.6%
 
182.0%
 

freetime
Real number (ℝ≥0)

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.2354430379746835
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size3.2 KiB

Quantile statistics

Minimum1
5-th percentile2
Q13
median3
Q34
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.9988620397
Coefficient of variation (CV)0.3087249653
Kurtosis-0.3018073698
Mean3.235443038
Median Absolute Deviation (MAD)1
Skewness-0.163350753
Sum1278
Variance0.9977253743
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
315739.7%
 
411529.1%
 
26416.2%
 
54010.1%
 
1194.8%
 
ValueCountFrequency (%) 
1194.8%
 
26416.2%
 
315739.7%
 
411529.1%
 
54010.1%
 
ValueCountFrequency (%) 
54010.1%
 
411529.1%
 
315739.7%
 
26416.2%
 
1194.8%
 

goout
Real number (ℝ≥0)

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.108860759493671
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size3.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median3
Q34
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.113278174
Coefficient of variation (CV)0.3580984355
Kurtosis-0.770250241
Mean3.108860759
Median Absolute Deviation (MAD)1
Skewness0.1165024169
Sum1228
Variance1.239388293
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
313032.9%
 
210326.1%
 
48621.8%
 
55313.4%
 
1235.8%
 
ValueCountFrequency (%) 
1235.8%
 
210326.1%
 
313032.9%
 
48621.8%
 
55313.4%
 
ValueCountFrequency (%) 
55313.4%
 
48621.8%
 
313032.9%
 
210326.1%
 
1235.8%
 

Dalc
Real number (ℝ≥0)

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.481012658227848
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size3.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q32
95-th percentile3
Maximum5
Range4
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.8907414281
Coefficient of variation (CV)0.6014407933
Kurtosis4.759492465
Mean1.481012658
Median Absolute Deviation (MAD)0
Skewness2.190761845
Sum585
Variance0.7934202917
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
127669.9%
 
27519.0%
 
3266.6%
 
592.3%
 
492.3%
 
ValueCountFrequency (%) 
127669.9%
 
27519.0%
 
3266.6%
 
492.3%
 
592.3%
 
ValueCountFrequency (%) 
592.3%
 
492.3%
 
3266.6%
 
27519.0%
 
127669.9%
 

Walc
Real number (ℝ≥0)

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.2911392405063293
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size3.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median2
Q33
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.287896592
Coefficient of variation (CV)0.5621206122
Kurtosis-0.7908450619
Mean2.291139241
Median Absolute Deviation (MAD)1
Skewness0.6119599829
Sum905
Variance1.658677633
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
115138.2%
 
28521.5%
 
38020.3%
 
45112.9%
 
5287.1%
 
ValueCountFrequency (%) 
115138.2%
 
28521.5%
 
38020.3%
 
45112.9%
 
5287.1%
 
ValueCountFrequency (%) 
5287.1%
 
45112.9%
 
38020.3%
 
28521.5%
 
115138.2%
 

health
Real number (ℝ≥0)

Distinct count5
Unique (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.5544303797468353
Minimum1
Maximum5
Zeros0
Zeros (%)0.0%
Memory size3.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q13
median4
Q35
95-th percentile5
Maximum5
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.390303391
Coefficient of variation (CV)0.3911466094
Kurtosis-1.014078286
Mean3.55443038
Median Absolute Deviation (MAD)1
Skewness-0.4946035682
Sum1404
Variance1.93294352
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
514637.0%
 
39123.0%
 
46616.7%
 
14711.9%
 
24511.4%
 
ValueCountFrequency (%) 
14711.9%
 
24511.4%
 
39123.0%
 
46616.7%
 
514637.0%
 
ValueCountFrequency (%) 
514637.0%
 
46616.7%
 
39123.0%
 
24511.4%
 
14711.9%
 

absences
Real number (ℝ≥0)

ZEROS

Distinct count: 34
Unique (%): 8.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.708860759493671
Minimum: 0
Maximum: 75
Zeros: 115
Zeros (%): 29.1%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 4
Q3: 8
95-th percentile: 18.3
Maximum: 75
Range: 75
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 8.003095687
Coefficient of variation (CV): 1.401872637
Kurtosis: 21.71914972
Mean: 5.708860759
Median Absolute Deviation (MAD): 4
Skewness: 3.67157895
Sum: 2255
Variance: 64.04954058
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
011529.1%
 
26516.5%
 
45313.4%
 
6317.8%
 
8225.6%
 
10174.3%
 
14123.0%
 
12123.0%
 
382.0%
 
771.8%
 
1671.8%
 
1851.3%
 
551.3%
 
2041.0%
 
2230.8%
 
1330.8%
 
130.8%
 
930.8%
 
1130.8%
 
1530.8%
 
2310.3%
 
2410.3%
 
2110.3%
 
2510.3%
 
5610.3%
 
Other values (9)92.3%
 
ValueCountFrequency (%) 
011529.1%
 
130.8%
 
26516.5%
 
382.0%
 
45313.4%
 
551.3%
 
6317.8%
 
771.8%
 
8225.6%
 
930.8%
 
ValueCountFrequency (%) 
7510.3%
 
5610.3%
 
5410.3%
 
4010.3%
 
3810.3%
 
3010.3%
 
2810.3%
 
2610.3%
 
2510.3%
 
2410.3%
 

G1
Real number (ℝ≥0)

Distinct count: 17
Unique (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.90886075949367
Minimum: 3
Maximum: 19
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 3
5-th percentile: 6
Q1: 8
Median: 11
Q3: 13
95-th percentile: 16
Maximum: 19
Range: 16
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.319194672
Coefficient of variation (CV): 0.3042659307
Kurtosis: -0.6938295024
Mean: 10.90886076
Median Absolute Deviation (MAD): 3
Skewness: 0.2406132434
Sum: 4309
Variance: 11.01705327
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
105112.9%
 
84110.4%
 
11399.9%
 
7379.4%
 
12358.9%
 
13338.4%
 
9317.8%
 
14307.6%
 
15246.1%
 
6246.1%
 
16225.6%
 
1782.0%
 
1882.0%
 
571.8%
 
1930.8%
 
410.3%
 
310.3%
 
ValueCountFrequency (%) 
310.3%
 
410.3%
 
571.8%
 
6246.1%
 
7379.4%
 
84110.4%
 
9317.8%
 
105112.9%
 
11399.9%
 
12358.9%
 
ValueCountFrequency (%) 
1930.8%
 
1882.0%
 
1782.0%
 
16225.6%
 
15246.1%
 
14307.6%
 
13338.4%
 
12358.9%
 
11399.9%
 
105112.9%
 

G2
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct count: 17
Unique (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.713924050632912
Minimum: 0
Maximum: 19
Zeros: 13
Zeros (%): 3.3%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 9
Median: 11
Q3: 13
95-th percentile: 16.3
Maximum: 19
Range: 19
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.76150466
Coefficient of variation (CV): 0.3510856192
Kurtosis: 0.6277056434
Mean: 10.71392405
Median Absolute Deviation (MAD): 2
Skewness: -0.431645389
Sum: 4232
Variance: 14.1489173
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
95012.7%
 
104611.6%
 
124110.4%
 
13379.4%
 
11358.9%
 
15348.6%
 
8328.1%
 
14235.8%
 
7215.3%
 
5153.8%
 
6143.5%
 
0133.3%
 
16133.3%
 
18123.0%
 
1751.3%
 
1930.8%
 
410.3%
 
ValueCountFrequency (%) 
0133.3%
 
410.3%
 
5153.8%
 
6143.5%
 
7215.3%
 
8328.1%
 
95012.7%
 
104611.6%
 
11358.9%
 
124110.4%
 
ValueCountFrequency (%) 
1930.8%
 
18123.0%
 
1751.3%
 
16133.3%
 
15348.6%
 
14235.8%
 
13379.4%
 
124110.4%
 
11358.9%
 
104611.6%
 

G3
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct count: 18
Unique (%): 4.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.415189873417722
Minimum: 0
Maximum: 20
Zeros: 38
Zeros (%): 9.6%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 8
Median: 11
Q3: 14
95-th percentile: 17
Maximum: 20
Range: 20
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.581442611
Coefficient of variation (CV): 0.4398808535
Kurtosis: 0.4034208131
Mean: 10.41518987
Median Absolute Deviation (MAD): 3
Skewness: -0.732672353
Sum: 4114
Variance: 20.9896164
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
105614.2%
 
114711.9%
 
0389.6%
 
15338.4%
 
8328.1%
 
13317.8%
 
12317.8%
 
9287.1%
 
14276.8%
 
16164.1%
 
6153.8%
 
18123.0%
 
792.3%
 
571.8%
 
1761.5%
 
1951.3%
 
410.3%
 
2010.3%
 
ValueCountFrequency (%) 
0389.6%
 
410.3%
 
571.8%
 
6153.8%
 
792.3%
 
8328.1%
 
9287.1%
 
105614.2%
 
114711.9%
 
12317.8%
 
ValueCountFrequency (%) 
2010.3%
 
1951.3%
 
18123.0%
 
1761.5%
 
16164.1%
 
15338.4%
 
14276.8%
 
13317.8%
 
12317.8%
 
114711.9%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
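The calculation described above can be sketched in a few lines of plain Python. The G2 and G3 values are taken from the ten first rows shown in the Sample section, so the result is illustrative only and is not the correlation over all 395 observations. The function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # The 1/(n - 1) factors of covariance and standard deviations cancel out,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# G2 and G3 for rows 0 to 9 of the sample.
g2 = [6, 5, 8, 14, 10, 15, 12, 5, 18, 15]
g3 = [6, 6, 10, 15, 10, 15, 11, 6, 19, 15]

r = pearson_r(g2, g3)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(g2, [2 * v + 3 for v in g3])
print(r, r_rescaled)
```

Even on this small subset the coefficient is close to +1, in line with the high-correlation warning for G2 and G3.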

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
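A minimal sketch of this rank-based calculation, assuming ties receive the average of the ranks they span (a common convention; the report does not say which one it uses). The helper names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's r: covariance over the product of standard deviations."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho is 1, while r falls below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```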

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
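The pair-counting definition above translates directly into code. The sketch below implements exactly that definition (often called tau-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries frequently default to tie-adjusted variants, so results on data with ties can differ from the report's.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, as defined above."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```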

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
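A minimal sketch of a bias-corrected Cramér's V for a contingency table of counts, following the correction commonly attributed to Bergsma (2013). Whether pandas-profiling computes it in exactly this way is an assumption. The function name is hypothetical, and every row and column total is assumed to be non-zero.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrections: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


independent = [[10, 20], [20, 40]]   # rows are proportional: no association
perfect = [[50, 0], [0, 50]]         # each row value determines the column value
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The two toy tables are made up for illustration and mark the endpoints of the range: independence gives 0 and perfect association gives 1.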

Missing values

Sample

First rows

school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason guardian traveltime studytime failures schoolsup famsup paid activities nursery higher internet romantic famrel freetime goout Dalc Walc health absences G1 G2 G3
0 GP F 18 U GT3 A 4 4 at_home teacher course mother 2 2 0 yes no no no yes yes no no 4 3 4 1 1 3 6 5 6 6
1 GP F 17 U GT3 T 1 1 at_home other course father 1 2 0 no yes no no no yes yes no 5 3 3 1 1 3 4 5 5 6
2 GP F 15 U LE3 T 1 1 at_home other other mother 1 2 3 yes no yes no yes yes yes no 4 3 2 2 3 3 10 7 8 10
3 GP F 15 U GT3 T 4 2 health services home mother 1 3 0 no yes yes yes yes yes yes yes 3 2 2 1 1 5 2 15 14 15
4 GP F 16 U GT3 T 3 3 other other home father 1 2 0 no yes yes no yes yes no no 4 3 2 1 2 5 4 6 10 10
5 GP M 16 U LE3 T 4 3 services other reputation mother 1 2 0 no yes yes yes yes yes yes no 5 4 2 1 2 5 10 15 15 15
6 GP M 16 U LE3 T 2 2 other other home mother 1 2 0 no no no no yes yes yes no 4 4 4 1 1 3 0 12 12 11
7 GP F 17 U GT3 A 4 4 other teacher home mother 2 2 0 yes yes no no yes yes no no 4 1 4 1 1 1 6 6 5 6
8 GP M 15 U LE3 A 3 2 services other home mother 1 2 0 no yes yes no yes yes yes no 4 2 2 1 1 1 0 16 18 19
9 GP M 15 U GT3 T 3 4 other other home mother 1 2 0 no yes yes yes yes yes yes no 5 5 1 1 1 5 0 14 15 15

Last rows

school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason guardian traveltime studytime failures schoolsup famsup paid activities nursery higher internet romantic famrel freetime goout Dalc Walc health absences G1 G2 G3
385 MS F 18 R GT3 T 2 2 at_home other other mother 2 3 0 no no yes no yes yes no no 5 3 3 1 3 4 2 10 9 10
386 MS F 18 R GT3 T 4 4 teacher at_home reputation mother 3 1 0 no yes yes yes yes yes yes yes 4 4 3 2 2 5 7 6 5 6
387 MS F 19 R GT3 T 2 3 services other course mother 1 3 1 no no no yes no yes yes no 5 4 2 1 2 5 0 7 5 0
388 MS F 18 U LE3 T 3 1 teacher services course mother 1 2 0 no yes yes no yes yes yes no 4 3 4 1 1 1 0 7 9 8
389 MS F 18 U GT3 T 1 1 other other course mother 2 2 1 no no no yes yes yes no no 1 1 1 1 1 5 0 6 5 0
390 MS M 20 U LE3 A 2 2 services services course other 1 2 2 no yes yes no yes yes no no 5 5 4 4 5 4 11 9 9 9
391 MS M 17 U LE3 T 3 1 services services course mother 2 1 0 no no no no no yes yes no 2 4 5 3 4 2 3 14 16 16
392 MS M 21 R GT3 T 1 1 other other course other 1 1 3 no no no no no yes no no 5 5 3 3 3 3 3 10 8 7
393 MS M 18 R LE3 T 3 2 services other course mother 3 1 0 no no no no no yes yes no 4 4 1 3 4 5 0 11 12 10
394 MS M 19 U LE3 T 1 1 other at_home course father 1 1 0 no no no no yes yes yes no 3 2 3 3 3 5 5 8 9 9